DETAILED ACTION
Claim Rejections - 35 USC §112 

1. The following is a quotation of the first paragraph of 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention.

2. Claim 15 is rejected under 35 U.S.C. 112, first paragraph, as failing to comply 
with the enablement requirement. The claim(s) contains subject matter which was not 
described in the specification in such a way as to enable one skilled in the art to which it 
pertains, or with which it is most nearly connected, to make and/or use the invention. 

Specifically, it is not clear from the specification whether "tagged data [including]
two consecutive words" refers to bi-grams, as described on Page 11, or to tags composed
of several words (Page 16, i.e. FIRSTNAMELASTNAME). Examiner interpreted claim
15 to refer to the latter (Page 16 definition).

Claim Rejections - 35 USC §102

3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

4. Claims 1-4, 8-9, 12-13, 16-25, 28-29, 32-33, 36 are rejected under 35
U.S.C. 102(b) as being anticipated by Chou et al. (5,797,123).

The U.S. patent of Chou et al. discloses a computer based system and hence 
necessarily includes the computer code (claims 21-36) and the apparatus (claim 17) 
necessary to implement such a system. 



Claim# 


Limitations 


Chou et al. 


1, 17, 21


A computer implemented speech recognition method for 
performing Natural Language Understanding (NLU) 
functions, comprising the steps of: 

(a) converting a user utterance into a plurality of 
basic speech units, said user utterance being a sequence of 
words expressing a query or a command 

(b) matching said plurality of basic speech units 
against a plurality of combinations of items, wherein each 
item is tagged data or is a concept code 

and (c) generating a combination of items likely to 
be representative of said user utterance. 


System is implemented in context of 
sub-word speech recognition (Col. 4, 
lines 28-38 and Col. 5, lines 60-62) 
which inherently converts user's 
utterances to basic speech units 
(syllables, phonemes, etc.) 

Detected keywords are tagged with 
conceptual information once they are 
recognized (Col. 5, lines 27-30 and 
lines 57-60) and then verified at 
various stages (11, 12, 13, 14, FIG. 1) 

System produces verified sentence 
hypothesis (output of elem. 14, FIG. 1) 


2, 22 


The method of claim 1, said step (b) further comprising: 

(d) a first step of matching said plurality of basic 
speech units against a vocabulary of items to generate a first 
list of items likely to be representative of said user utterance. 


Key-phrase detection (first list of
items) using sub-word based speech
recognizer, which inherently compares
basic speech items against models
(vocabulary of items) (Col. 5, lines
60-67)


3, 23 


The method of claim 2, wherein said step (d) is performed 
using Hidden Markov Models. 


The models are HMM (Col. 5, lines 
65-67) 


4, 24 


The method of claim 2, said step (b) further comprising: 

(e) a second step of matching said first list of items 
against said plurality of combinations of items to generate 
said combination of items likely to be representative of said 
user utterance in said step (c). 


The use of anti-subword models
during key-phrase verification
produces a verified key-phrase. (Col. ,
line 66 - Col. 8, line 16)


5, 25 


The method of claim 4, wherein said step (e) is processed 
using a conceptual language model. 


Anti-subword model is a conceptual 
language model (Col. 8, lines 17-20) 


8, 28 


The method of claim 4, wherein said step (c) is processed 
using a conceptual grammar. 


Sentence parsing is done using 
"semantic constraint information" 
(Col. 10, lines 23-25 and Col. 11, lines 
33-35), which is another term for 
conceptual grammar. 


9, 29 


The method of claim 2, further comprising: 

a training step defining said vocabulary of items of 
said step (d). 


Subword HMMs are inherently 
trained. (Col. 9, lines 30-32) 


12, 32 


The method of claim 1, further comprising: 

storing a set of prototype acoustic models obtained 
from a training phase, wherein each said acoustic model 
represents one or more possible basic speech units of an 
utterance of a word. 


Inherent to training process (Col. 9, 
lines 30-32). As it is well-known in 
the art, HMM models are stored in 
memory (Col. 12, lines 15-16) 


13, 33


The method of claim 12, further comprising: 

assigning one of said acoustic models to each said 
basic speech unit.


Inherent to sub-word speech 
recognition, i.e. for each phoneme the 
system maintains a corresponding 
acoustic model (See Col. 4, lines 30- 
38) 


16, 36 


The method of claim 1, further comprising: 

sending said most likely combination of items to a 
function identification module to perform said user query or 
command. 


System can be used to understand and
perform user queries, such as actions
taken during automobile reservations
(Col. 3, line 65 - Col. 4, line 5)


18


A speech recognition system for performing Natural
Language Understanding, said system comprising:

an acoustic processor, said acoustic processor for
receiving a user spoken utterance and determining a string of
labels identifying a corresponding sound of said user spoken
utterance

a decoder communicatively linked to said acoustic
processor, said decoder determining a likely sequence of
items corresponding to said determined string of labels

a conceptual pronunciation dictionary providing
said decoder with a pronunciation of said items

a conceptual syntax module providing said decoder
with a set of allowable combined items

and a target function identification module
communicatively linked to said decoder, said target function
identification module executing a function corresponding to
said likely sequence of items.


Subword model recognizer in
key-phrase detector (11, FIG. 1 and
Col. 5, lines 60-63)

Key-phrase detector/verifier (11, 12,
FIG. 1) and sentence
hypothesizer/verifier (13, 14, FIG. 1)
use HMM models (22, FIG. 1 and
Col. 5, lines 60-63)

Lexicon (23, FIG. 1)

Key-phrase grammars, anti-subword
models (21, 24, FIG. 1) and semantic
information (25, FIG. 1)

Output of (14, FIG. 1) goes to
reservation system manager, etc.
(Col. 3, line 65 - Col. 4, line 5)


19


The system of claim 18, wherein said decoder comprises a
fast acoustic match and a detailed acoustic match.


Key-phrase detector/verifier (fast
match 11, 12, FIG. 1) and sentence
hypothesizer/verifier (13, 14, FIG. 1)


20


The system of claim 18, wherein said conceptual syntax
module comprises a conceptual language model or a
conceptual grammar.


Anti-subword models (24, 27, FIG. 1)
or semantic information (25, FIG. 1)



Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 6-7, 10-11, 14-15, 26-27, 30-31, 34-35 are rejected under 35 U.S.C.
103(a) as being unpatentable over Chou et al.

The U.S. patent of Chou et al. discloses a computer based system and hence 
necessarily includes the computer code (claims 21-36) and the apparatus (claim 17) 
necessary to implement such a system. 

As per claims 6-7, 26-27, Chou et al. do not disclose the use of n-grams. 

However, Chou et al. disclose computing confidence measures for a key-phrase
of N words using combined likelihood ratios of all N sub-words (Col. 8, lines 45-51),
which are very similar to well-known n-gram models, except that the individual
likelihoods of subwords in Chou et al. are not conditionally dependent on the previous
words, as they are in n-gram models.

The examiner takes official notice that the use of n-gram conceptual models
is notoriously well known in the art of speech recognition. In these models, the probability
of an n-gram is expressed as:

$$p(s) = p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1 w_2) \cdots p(w_l \mid w_1 \ldots w_{l-1}) = \prod_{i=1}^{l} p(w_i \mid w_1 \ldots w_{i-1})$$

where p(w_i) is a probability derived from training the model on a large corpus of
data. (See U.S. Patent No. 6,374,217, Col. 5, lines 17-49.)
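
For illustration only, the chain-rule product above can be sketched for the bi-gram case,
in which each word is conditioned only on the word immediately before it. The toy corpus,
the function names, and the maximum-likelihood (unsmoothed) estimates are assumptions made
for this sketch; they are not taken from Chou et al. or from the cited Bellegarda patent.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences (toy corpus, assumed)."""
    histories, bigrams = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + words  # "<s>" marks the sentence start
        histories.update(padded[:-1])
        bigrams.update(zip(padded[:-1], padded[1:]))
    return histories, bigrams

def sentence_probability(words, histories, bigrams):
    """p(s) = prod_i p(w_i | w_{i-1}), using unsmoothed relative frequencies."""
    prob = 1.0
    padded = ["<s>"] + words
    for prev, cur in zip(padded[:-1], padded[1:]):
        if histories[prev] == 0:
            return 0.0  # unseen history: probability is zero without smoothing
        prob *= bigrams[(prev, cur)] / histories[prev]
    return prob

# Hypothetical training data built from key-phrases quoted in this action.
corpus = [
    ["in", "the", "morning"],
    ["in", "downtown", "chicago"],
    ["in", "the", "evening"],
]
histories, bigrams = train_bigram(corpus)
p_seen = sentence_probability(["in", "the", "morning"], histories, bigrams)
p_unseen = sentence_probability(["in", "the", "chicago"], histories, bigrams)
```

Under these assumptions a word sequence that never occurred in training receives probability
zero, which is why practical n-gram models add smoothing.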

While Chou et al.'s patent suggests that n-grams would not benefit Chou et al.'s system 
during the keyword detection stage (Col. 5, lines 35-48), the use of n-grams would not 
contradict Chou et al.'s teachings in the key-phrase verification stage. Indeed, since 
Chou et al. already teach computing probabilities (likelihoods) for key-phrases with 
multiple words, the use of n-gram probabilities would ensure removal of key-phrases 
with low combined probabilities (Col. 8, lines 44-46). In other words, the key-phrase
verifier could use n-grams to verify the key-phrase instead of finding combined
likelihood ratios.

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Chou et al. to use n-grams derived from initial training, in
order to compute the probability (confidence measure) of each detected key-phrase by
combining corresponding subword-level probabilities (Col. 8, lines 44-46), so as to
improve the accuracy of the verification process and remove key-phrase hypotheses
that have scores lower than a predetermined threshold (Col. 8, lines 52-55). The
motivation for doing so is suggested by the combination of N word likelihood ratios in
Chou et al. (Col. 8, lines 44-46), which is very similar to computing n-gram
probabilities, as is well known in the art.
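
As a rough sketch of the verification step discussed above, the following combines
per-subword likelihood ratios into one key-phrase confidence score and drops hypotheses
that score below a threshold. The averaging of log-likelihood ratios, the example scores,
and all names are assumptions for illustration; the exact confidence measure of Chou et al.
(Col. 8) is not reproduced here.

```python
import math

def subword_llr(p_model, p_anti):
    """Log-likelihood ratio of a subword model against its anti-subword model."""
    return math.log(p_model) - math.log(p_anti)

def keyphrase_confidence(subword_scores):
    """Combine subword ratios by averaging them (an assumed combination rule)."""
    llrs = [subword_llr(p_model, p_anti) for p_model, p_anti in subword_scores]
    return sum(llrs) / len(llrs)

def verify(hypotheses, threshold):
    """Keep only key-phrase hypotheses whose confidence reaches the threshold."""
    return [
        phrase
        for phrase, scores in hypotheses
        if keyphrase_confidence(scores) >= threshold
    ]

# Hypothetical (model likelihood, anti-model likelihood) pairs per subword.
hypotheses = [
    ("in the morning", [(0.8, 0.2), (0.7, 0.3), (0.9, 0.1)]),
    ("in the mourning", [(0.4, 0.6), (0.3, 0.7), (0.5, 0.5)]),
]
accepted = verify(hypotheses, threshold=0.0)
```

Replacing `keyphrase_confidence` with an n-gram probability of the phrase would leave the
thresholding step unchanged, which is the substitution the rejection proposes.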

As per claims 10-11, 30-31, Chou et al. do not explicitly disclose "defining said
plurality of combinations of items of said step (c) in a training step." However, Chou et
al. disclose using "semantic constraint information" for sentence parsing.

The examiner takes official notice that, as is well known in the art,
"semantic constraint information" (a.k.a. conceptual grammar) necessarily requires an
initial step of training, since such information (sentence structure) can only be entered
into the system by human operators (at least initially).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made that Chou et al.'s method of using "semantic constraint information"
necessarily involves initially training (inputting) said "semantic constraint information"
into the system, so as to enable the system to later automatically use said "semantic
constraint information" for key-phrase sentence parsing and verification.

As per claims 14, 34, Chou et al. do not disclose that the user's utterance is in the
form of isolated data (i.e. "Pedro Romero", as described in the Specification, page 7).

However, Chou et al. teach parsing various types of key-phrases ("in downtown
Chicago," "in the morning", etc. - See Col. 5, lines 20-25) and also teach using the
system for dialogue-based automobile reservation. Hence, the examiner takes
official notice that, as is well known in the art, dialogues with automobile reservation
systems involve questions of the type "Please say your name...", etc., which require a reply
in the form of isolated data. Many other examples exist in the art, such as dialogues
with voice-mail systems, airline reservation systems, etc.

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Chou et al. to process "user's utterance in the form of
isolated data" in order to support various types of input required for the
reservation/voice-mail systems.

As per claims 15, 35, Chou et al. do not explicitly disclose tags containing two
consecutive keywords.

However, Chou et al. disclose key-phrases containing a number of consecutive
words, e.g. "in downtown Chicago," etc., identifying a local geographic area (Col. 5,
lines 20-26), which could be processed by using n-grams, such as bi-grams (as explained
in the rejection for claim 6).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Chou et al. to use a tag such as
LOCAL_GEOGRAPHIC_AREA for identification of the key-phrase or a combination of
tags, such as AREA+TIME ("in the morning"), in order to allow the system to handle
complex user queries by using "semantic constraint information which specifies
permissible combinations of key-phrase tags" (Col. 10, lines 23-25) - Note: the use of
'combinations of key-phrase tags' implies the presence of at least two key-phrase tags.
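
To make the notion of "permissible combinations of key-phrase tags" concrete, a minimal
sketch follows. The tag names, the phrase-to-tag lexicon, and the set of allowed
combinations are hypothetical examples chosen for this illustration; Chou et al. do not
specify this data structure.

```python
# Hypothetical lexicon mapping detected key-phrases to conceptual tags.
KEY_PHRASE_TAGS = {
    "in downtown chicago": "LOCAL_GEOGRAPHIC_AREA",
    "in the morning": "TIME",
}

# Hypothetical semantic constraint information: which tag sequences are allowed.
PERMISSIBLE_COMBINATIONS = {
    ("LOCAL_GEOGRAPHIC_AREA",),
    ("TIME",),
    ("LOCAL_GEOGRAPHIC_AREA", "TIME"),
}

def tag_key_phrases(phrases):
    """Map each detected key-phrase to its tag; raises KeyError if unknown."""
    return tuple(KEY_PHRASE_TAGS[phrase] for phrase in phrases)

def is_permissible(phrases):
    """Accept a sentence hypothesis only if its tag sequence is allowed."""
    try:
        tags = tag_key_phrases(phrases)
    except KeyError:
        return False  # a phrase with no tag cannot satisfy the constraints
    return tags in PERMISSIBLE_COMBINATIONS

ok = is_permissible(["in downtown chicago", "in the morning"])
reversed_order = is_permissible(["in the morning", "in downtown chicago"])
unknown = is_permissible(["at the airport"])
```

In this sketch the order of tags matters, so the reversed sequence is rejected unless it is
listed explicitly; a set-based check would be an equally plausible design.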

Conclusion 

7. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Bellegarda (6,374,217) teaches using n-grams for semantic language modeling.
Sukkar (6,292,778) teaches utterance verification using subword-based minimum
verification error training.

Lee et al. (5,675,706) teach sub-word based keyword detection with tagging.
Martin (5,642,519) teaches using speech recognition with tagging.

8. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Dmitry Brant, whose telephone number is (703) 305-8954.
The examiner can normally be reached Mon. - Fri. (8:30am - 5pm).

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Ivars Smits, can be reached at (703) 306-3011. The fax phone
number for the organization where this application or proceeding is assigned is (703)
872-9306.

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to the Tech Center 2600 receptionist, whose telephone
number is (703) 305-4700.